Evaluating Prospective Teachers: Testing the Predictive Validity of the edTPA

Dan Goldhaber, James Cowan, & Roddy Theobald 
CALDER Working Paper No. 157 
November 2016 


Abstract 

We use longitudinal data from Washington State to provide estimates of the extent to which 
performance on the edTPA, a performance-based, subject-specific assessment of teacher candidates, is 
predictive of the likelihood of employment in the teacher workforce and value-added measures of 
teacher effectiveness. While edTPA scores are highly predictive of employment in the state's public 
teaching workforce, evidence on the relationship between edTPA scores and teaching effectiveness is 
more mixed. Specifically, continuous edTPA scores are a significant predictor of student mathematics 
achievement in some specifications, but when we consider that the edTPA is a binary screen of teaching 
effectiveness (i.e., pass/fail), we find that passing the edTPA is significantly predictive of teacher 
effectiveness in reading but not in mathematics. We also find that Hispanic candidates in Washington 
were more than three times more likely to fail the edTPA after it became consequential in the state than 
non-Hispanic White candidates. 



I. Background: The Teacher Education Accountability Movement 

It is fair to say that teacher education programs are facing significant scrutiny over the inservice
performance of their graduates. About 75% of the roughly 100,000 novice teachers who enter the public
school workforce each year are trained in a traditional college or university setting, and there is
significant policy concern that the preparation that prospective teachers receive is not adequate to
ensure they are ready to teach on their first day in a classroom. Former Education Secretary Arne 
Duncan, for instance, stated that: "By almost any standard, many if not most of the nation's 1,450 
schools, colleges and departments of education are doing a mediocre job of preparing teachers for the 
realities of the 21st century classroom" (U.S. Department of Education, 2009). 

Given this environment, it is not surprising that there are a number of new initiatives designed
to hold teacher education programs (TEPs) more accountable, either through direct measures of the
training they provide teacher candidates or based on output measures, such as the value added of 
candidates who enter the teaching workforce. One of the ways that TEPs and states have responded to 
this increased accountability pressure is by adopting the edTPA, a performance-based, subject-specific 
assessment that is administered to teacher candidates during their student teaching assignment. There
has been remarkably rapid policy diffusion of this assessment from its initial field testing in 2012 to full
implementation (Gottlieb et al., 2016): The edTPA is now used by over 600 TEPs in 40 states, and passing
the edTPA is a requirement for licensure in seven states. 1 Yet despite the rapid adoption of this 
assessment, critics of the edTPA (e.g., Greenblatt & O'Hara, 2015) point out that there is limited large- 
scale research linking edTPA scores to outcomes for inservice teachers and their students. 


1 See http://edtpa.aacte.org 
There are several theories of action for how teacher performance assessments like the edTPA
might improve the quality of the teacher workforce. First, the edTPA can be used as a high-stakes screen
and "provide a consistent standard for entry into the profession" (Hill et al., 2011); this is how the edTPA
is currently used in states in which the assessment is a requirement to participate in the labor market. 2
This use of the edTPA requires predictive validity around the cut point adopted for labor market 
participation, which is set to different scores in different states through "standard setting conferences" 
described in edTPA (2015). 3 

The edTPA might also improve the quality of the teaching workforce by affecting candidate 
teaching practices. Indeed, the edTPA is described by its developers as an "educative assessment" that 
"supports candidate learning and preparation program renewal" (edTPA, 2015), and Hill et al. (2011) 
suggest that teacher performance assessments like the edTPA could "describe expectations for
novice teaching and set a trajectory of improvement over the developmental continuum." This could
occur at the individual teacher candidate level if, for instance, participation in the edTPA directly
influences the teaching practices of teacher candidates. Alternatively, this could occur at the TEP level if,
for instance, participation in the edTPA influences the training provided by TEPs. Finally, the edTPA
might be used for hiring purposes; for instance, school systems might be more likely to hire teacher 
applicants with higher edTPA scores. Each of these potential mechanisms for workforce improvement 
requires that the edTPA provides a signal of quality teaching; i.e., that there is predictive validity away 
from the cut point such that differences in edTPA performance (at the candidate or institution level) 
might be indicative of teacher quality.


2 For a full summary of edTPA participation across the country, see edTPA (2015), p. 13. 

3 Note that the existence of different cut points in different states means that the edTPA cannot be expected to have 
predictive validity “only” around a single cut point. 
In this paper we use longitudinal data from Washington State that include information on
teacher candidates' scores on the edTPA to provide estimates of the extent to which edTPA scores are
predictive of the likelihood of entry into the teacher workforce and value-added measures of teacher 
effectiveness (i.e., predictive validity). Specifically, we test different theories of action for how the edTPA 
might improve the quality of the teacher workforce by considering the predictive validity of the edTPA 
as both a screen and a signal of future teacher effectiveness. 

Despite the fact that the edTPA was not consequential for some of the teacher candidates in our 
sample, we find that edTPA scores, both in terms of passing status and continuous scores, are highly
predictive of the probability that a teacher candidate is employed the following year in the state's public 
teaching workforce. Evidence on the connection between performance and value-added measures of 
teacher effectiveness is more mixed. When we consider the edTPA as a binary screen of teaching 
effectiveness (i.e., pass/fail), we find that passing the edTPA is significantly predictive of teacher 
effectiveness in reading but not in mathematics. Continuous edTPA scores provide a signal of future 
teaching effectiveness in mathematics in some specifications, but are not statistically significant in 
reading. In both reading and mathematics, the relationship between continuous edTPA scores and 
teacher effectiveness is somewhat stronger for candidates who took the test after it became 
consequential in Washington, suggesting that the edTPA may provide a better signal of teacher quality
when stakes are attached to the scores.

We also find that Hispanic teacher candidates score far lower than non-Hispanic White
candidates on the assessment. In fact, Hispanic candidates in Washington were more than three times
more likely to fail the edTPA after it became consequential in the state than non-Hispanic White
candidates (13.7% for Hispanic candidates compared to 3.7% for non-Hispanic White candidates). This
difference in passing rates strongly implies that the high-stakes use of the edTPA in Washington may
have an adverse impact on the diversity of the state's teacher candidate pool. However, it is important
to be cautious about interpreting this as an effect on the diversity of the state's teacher workforce. It is
possible, for example, that teachers who fail the test would be unlikely to obtain teaching positions in
the absence of the edTPA requirement, or that the high-stakes use of the edTPA elicits other behavioral
changes that affect who pursues a career as a teacher.

The rest of the paper proceeds as follows: In Section II, we provide additional information 
regarding teacher licensure and the edTPA in particular. We describe our data and analytic approach in 
Section III, present our findings in Section IV, outline some extensions in Section V, and offer concluding 
remarks in Section VI. 

II. Assessment of Prospective Teachers and the Role of the edTPA 

There are various ways that teacher candidates are typically assessed and judged to be
eligible (that is, licensed) to teach in public schools. Licensure in many states requires that prospective
teachers graduate from an approved TEP and complete some preservice student teaching, although the
last decade has also seen an increased reliance on teachers entering the profession through
state-approved alternative routes. Forty-nine of 50 states also require potential teachers to pass licensure
tests that cover basic skills, content knowledge, and/or professional knowledge. 

The edTPA, by design, is quite different from traditional question-and-answer licensure tests: It 
is a portfolio-based, subject-specific assessment akin to the National Board for Professional Teaching
Standards (NBPTS) assessment of inservice teachers. The edTPA was initially developed by researchers 
at Stanford University's Center for Assessment, Learning, and Equity (SCALE) and has been further 
developed and distributed through a partnership between SCALE, the American Association of Colleges 
for Teacher Education (AACTE), and Evaluation Systems (a member-organization of the Pearson 
Education group). The edTPA was initially introduced in two large-scale field tests in 2011-12 and
2012-13, and was "operationally launched" in 2013-14 (Pecheone et al., 2013). The edTPA relies on the
scoring of teacher candidates who are videotaped while teaching three to five lessons from an 
instructional unit to one class of students, along with assessments of teacher lesson plans, student work 
samples and evidence of student learning, and reflective commentaries by the candidate. Candidates 
pay a $300 fee to take the edTPA and often take several months to prepare their portfolios for 
submission (e.g., Jette, 2014). 

The edTPA is a subject-specific assessment with different versions aligned with 27 different 
teaching fields (e.g., "Early Childhood", "Secondary Mathematics"). 4 Each of these versions of the
edTPA contains 15 different rubrics, each of which is scored on a 1-5 scale; the rubrics have equal 
weight so the range of possible summative scores (for tests with no incomplete rubric scores) is 15 to 
75. 5 The 15 rubrics that are used to calculate a candidate's summative score in Washington State are
grouped into three areas: Planning (e.g., "Planning for Subject-Specific Understandings"); Instruction
(e.g., "Engaging Students in Learning"); and Assessment (e.g., "Analysis of Student Learning"). 6 Teacher
candidates in Washington State are also scored on three additional Student Voice rubrics (e.g., "Eliciting
Student Understanding of Learning Targets"), which are designed to incorporate student-produced
material into a teacher's evaluation. For reasons discussed in the next section, these rubric scores are
not currently used in computing a candidate's summative score. 7
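The scoring rules described above and in footnote 5 can be illustrated with a minimal Python sketch. It assumes the rubrics are ordered Planning (1-5), Instruction (6-10), and Assessment (11-15), as in Section III; the function names and the use of `None` for an incomplete rubric are hypothetical conventions, not part of the edTPA scoring system.

```python
from typing import Dict, Optional, Sequence

N_RUBRICS = 15
# Assumed rubric ordering: Planning 1-5, Instruction 6-10, Assessment 11-15.
AREAS = {
    "planning": range(0, 5),
    "instruction": range(5, 10),
    "assessment": range(10, 15),
}


def summative_score(rubrics: Sequence[Optional[int]]) -> Optional[int]:
    """Return the summative score, or None if the score is incomplete.

    Each rubric is an integer from 1 to 5, or None for an incomplete rubric.
    One incomplete rubric counts as zero; two or more make the total incomplete.
    """
    if len(rubrics) != N_RUBRICS:
        raise ValueError("expected 15 rubric scores")
    if any(r is not None and not 1 <= r <= 5 for r in rubrics):
        raise ValueError("rubric scores must be between 1 and 5")
    n_incomplete = sum(r is None for r in rubrics)
    if n_incomplete >= 2:
        return None
    return sum(0 if r is None else r for r in rubrics)


def subscores(rubrics: Sequence[int]) -> Dict[str, int]:
    """Return the three area subscores for a portfolio with no incomplete rubrics."""
    if len(rubrics) != N_RUBRICS or any(r is None for r in rubrics):
        raise ValueError("subscores require 15 complete rubric scores")
    return {area: sum(rubrics[i] for i in idx) for area, idx in AREAS.items()}
```

With equal weights, a complete portfolio scores between 15 and 75, while a portfolio with exactly one incomplete rubric can score at most 70.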


4 All analytic models presented in this paper control for test type, so compare outcomes only between candidates 
who took the same test type. 

5 Candidates may receive an incomplete score on any of the 15 rubrics for having technical issues with the upload,
uploading an incomplete file, having an edited video, or uploading material that is not related to the handbook. If a
candidate received only one incomplete score, it counts as a zero in the calculation of the final summative score; but
the summative score is incomplete if the candidate receives an incomplete on two or more rubrics.

6 We performed a principal component analysis on the 15 rubric scores and found that the rubric scores load onto
three factors that align closely with these areas.

7 The national edTPA handbook for elementary education also includes three additional Mathematics Assessment
rubrics (e.g., “Analyzing Whole Class Misunderstandings”) that have not been adopted in Washington State.
Proponents of the edTPA argue that the assessment and its precursors are authentic
measurement tools that can be used to predict teacher candidates' success in the classroom (e.g.,
Darling-Hammond et al., 2009; edTPA, 2015; Hill et al., 2011). While the edTPA is designed to assess 
individual teacher candidates, it is also thought to inform improvements in TEPs. Some states are, in 
fact, using the average edTPA performance of teacher candidates at an institution as a measure of 
institutional quality and/or in the accreditation process. In addition, the use of the edTPA is heavily 
promoted by AACTE, which touts the assessment as a means of improving "...the information base 
guiding the improvement of teacher preparation programs [and] [strengthening] the information base
for accreditation and evaluation of program effectiveness." 8 

Claims about the potential predictive validity of the edTPA are based on a small literature 
demonstrating that inservice teacher performance on portfolio-based assessments like the NBPTS 
assessment (Cantrell et al., 2008; Cowan and Goldhaber, 2016; Goldhaber and Anthony, 2015) and 
Washington State's ProTeach assessment (Goldhaber and Cowan, 2014) are predictive of teacher 
effectiveness, as well as two small-scale pilot studies of the edTPA's precursor, the Performance 
Assessment for California Teachers (PACT). 9 Specifically, Newton (2010) finds positive correlations 
between PACT scores and future value-added for a group of 14 teacher candidates, while
Darling-Hammond et al. (2013) use a sample of 52 mathematics teachers and 53 reading teachers and find that
a one-standard deviation increase in PACT scores is associated with a .03 standard deviation increase in
student achievement in either subject. 10 Beyond the fact that these estimates are based on small sample


8 See http://edtpa.aacte.org/about-edtpa#Goals-1.

9 The 2014 edTPA administrative report states that “Preliminary data from studies by Benner and Wishart (2015)
has revealed that edTPA scores predict candidates’ ratings of teacher effectiveness, as measured by a composite
score that combines students’ performance data and classroom observations” (edTPA, 2015). However, these data
have never been published, and follow-up documentation from the authors suggests that these relationships are more
mixed than this quote suggests (personal communication, May 2016).

10 Darling-Hammond et al. (2013) report nearly identical point estimates to those reported in this paper but with
substantially more precision using a considerably smaller sample than is available in this paper. We attempted to
replicate their findings using differing assumptions regarding the appropriate level of clustering and could only
sizes, however, there are several substantive differences between the edTPA and PACT in terms of
scoring, implementation, and standards alignment. 11

As described in the next section, the administrative data we utilize for our research allows us to 
leverage a larger sample size of teachers (over 200 in both mathematics and reading) than the PACT 
studies cited above. Each of these teachers took the edTPA after its full national implementation in the 
2013-14 school year. It is important to note, however, that the edTPA did not become consequential in
Washington State until January 2014, 12 so candidates who failed the test in fall 2013 (as well as
candidates who failed after January 2014 but subsequently re-took and passed the test) provide an
opportunity to observe candidates who failed the test but still entered the public teaching workforce.

While this study is one of the first to provide evidence on the validity of the edTPA as a measure 
of classroom performance, it is important to distinguish the validity of the edTPA as an assessment of
teaching practice from its efficacy as a teacher licensing tool. In particular, while validity is a significant 
prerequisite for using the edTPA to support effective licensure policy, extrapolating from these results to 
the effects of particular policies requires imposing additional assumptions beyond those that we test 
here. 

In particular, four features of common licensure policies limit such additional conclusions. First, 
licensure policies may change the population of potential teachers if candidates view the test as costly. 
There is some evidence from changes to state licensing provisions that licensure tests discourage some 
candidates with high academic achievement or outside wage offers from pursuing teaching as a 
profession, although evidence on overall effects on student achievement is mixed (Angrist & Guryan, 


estimate coefficients with similar levels of precision in models that assume independent errors across students in the
same classroom. We attempted to compare modeling choices directly, but in discussions with the authors, we were
unable to do so as they no longer have their data files (personal communication, February 2016).

11 See http://www.ctc.ca.gov/commission/agendas/2012-09/2012-09-2F.pdf

12 See http://assessment.pesb.wa.gov/faa/edtpa-policies




2008; Larsen, 2015; Wiswall, 2007). Second, policies typically allow candidates to attempt the 
assessment multiple times. In the second half of the 2013-14 school year (when the edTPA was
consequential), 4% of test takers failed the edTPA the first time they took it, but about half of these
candidates eventually passed the test. Third, the matching of teacher candidates to teaching positions
may provide additional screening beyond what is required by law. For example, it is not clear that the
small number of teachers in our sample who never pass the edTPA would obtain employment even in 
the absence of testing requirements. Finally, licensure systems like the edTPA might have system-wide 
effects on teacher quality. If participation in the edTPA raises overall performance, the signaling effects 
we estimate here may understate the overall effects of implementing testing requirements. The policy 
effects of national implementation of the edTPA, and similar authentic licensure assessments, therefore 
remain an important area for future research.

III. Data and Analytic Approach 

1. Data 

Our research uses administrative data on teacher candidates provided by Washington State's 
Professional Educator Standards Board (PESB), as well as data on Washington State students, teachers, 
and schools maintained by the Office of the Superintendent of Public Instruction (OSPI). The PESB data
include scores on each individual edTPA rubric (as well as the final summative score) for all teacher
candidates who took the edTPA in Washington State, not just those who ultimately are employed in the 
teacher workforce. As described in the previous section, the 15 rubrics used to compute the summative 
score can be combined into three subscores: Planning (rubrics 1-5), Instruction (rubrics 6-10), and 
Assessment (rubrics 11-15). 13 


13 The correlations between the three subscores range from 0.598 to 0.661.



Washington State participated in the edTPA field test in the 2012-13 school year (see Pecheone
et al., 2013), and the PESB data include teacher candidate scores from this pilot year and two
subsequent school years (2013-14 and 2014-15) after the full national implementation of edTPA.
Because there were substantive changes to the assessment between the pilot year and full
implementation (edTPA, 2015), and because inservice data are not yet available for teacher candidates
who took the edTPA in 2014-15, our primary results focus on the 2,362 teacher candidates from
Washington State TEPs who took the edTPA in the 2013-14 school year. In most cases, we consider
edTPA scores from each candidate's first test administration, although in cases where a candidate
received an incomplete score and subsequently resubmitted his or her materials within a month, we
disregard the initial incomplete score and consider a candidate's subsequent submission. 14 

We link these edTPA scores to data from OSPI that include test scores on other licensure tests 
that teacher candidates must also pass in order to be eligible to teach, such as the Washington Educator 
Skills Test-Basic (WEST-B), an assessment of basic skills in reading, writing, and mathematics that has
been a requirement for admission into Washington State TEPs since 2002. 15 Among teacher candidates
in the edTPA sample, 60.29% entered the state's public teaching workforce in the 2014-15 school year 
(defined as being employed in a certificated teaching position), and for these 1,424 teacher candidates, 
the OSPI data also include information about their school assignments, race, gender, and ethnicity. 

For the subset of 277 teacher candidates who enter the workforce and teach mathematics or
reading in Grades 4-8 (i.e., grades and subjects in which both current and prior test scores are available,
or the value-added sample), we can investigate the relationship between edTPA performance and
student achievement. Specifically, we observe annual student test scores in mathematics and reading in

14 We drop incomplete scores in cases where the candidate resubmits materials within a month of the score reporting 
date. We experimented with models that consider all incomplete scores as failures and found similar results.

15 Some alternative licensing exams may be submitted instead of taking the WEST-B. Thus, not all prospective
teachers take the WEST-B (RCW 28A.410.220 & WAC 181-01-002).
Grades 3-8 (also provided by OSPI) on the state's Measures of Student Progress (MSP) examination in
2012-13 and 2013-14 and Smarter Balanced Assessment (SBA) in the 2014-15 school year. 16 We
standardize these scores within grade and year and connect them to additional student demographic
information (gender, race/ethnicity, special education status, free/reduced-price lunch eligibility, and
English learner status) and, through a unique link in the state's Comprehensive Education Data and
Research System (CEDARS) data system, to data on the student's teachers in mathematics and reading 
(described above). 17 
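The standardization step can be sketched as follows. This is a minimal illustration, not the authors' code: the record keys (`grade`, `year`, `score`) are hypothetical, and the use of the population standard deviation is an assumption, since the text does not specify which estimator is used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_within_grade_year(records):
    """Return z-scores computed within each (grade, year) cell.

    `records` is a list of dicts with hypothetical keys 'grade', 'year', 'score'.
    The output list is in the same order as the input.
    """
    cells = defaultdict(list)
    for rec in records:
        cells[(rec["grade"], rec["year"])].append(rec["score"])

    # Population SD is an assumption; the paper does not say which SD it uses.
    stats = {cell: (mean(scores), pstdev(scores)) for cell, scores in cells.items()}

    z_scores = []
    for rec in records:
        mu, sd = stats[(rec["grade"], rec["year"])]
        # Guard against cells with no variation in scores.
        z_scores.append((rec["score"] - mu) / sd if sd > 0 else 0.0)
    return z_scores
```

After this transformation, a coefficient in a regression of these scores is expressed in standard deviations of student performance within the same grade and year.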

Table 1 summarizes data for prospective teachers who took the edTPA assessment in 2013-14 
for all candidates (columns 1-6) and for candidates who appear in the teaching workforce in 2014-15 
(columns 7-12). Within each set of columns, we present summary statistics for all individuals within the 
group (columns 1 and 7) and by quintile of performance on the edTPA (columns 2-6 and 8-12). 18 In 
column 1, we see that the overall first-time pass rate on the test, 93.9%, was quite high because
Washington State had set a low cut score of 35, but this passing rate would have been only 86.5% had 
the state used its future cut score of 40. 
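The comparison of pass rates under the two cut scores can be sketched with a short Python example. The scores below are hypothetical and chosen only for illustration; they do not reproduce the 93.9% and 86.5% figures. The sketch assumes a candidate passes when the summative score is at or above the cut score.

```python
def pass_rate(scores, cut_score):
    """Share of candidates whose summative score meets or exceeds the cut score."""
    if not scores:
        raise ValueError("no scores supplied")
    return sum(s >= cut_score for s in scores) / len(scores)


# Hypothetical first-attempt summative scores (possible range 15 to 75).
scores = [33, 36, 38, 41, 44, 45, 47, 50, 52, 58]

current_rate = pass_rate(scores, 35)  # cut score in effect in 2013-14
future_rate = pass_rate(scores, 40)   # the state's future cut score
```

Raising the cut score can only lower the pass rate (or leave it unchanged), because every candidate who clears the higher threshold also clears the lower one.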

The summary statistics for teacher candidates by quintile of performance on the edTPA 
(columns 2-6) make it clear that there is a correlation between edTPA performance and the WEST-B
basic skills licensure tests that are required for entry into Washington State's TEPs. 19 It is also
immediately clear that teachers who perform better on the edTPA are more likely to be employed in


16 About one third of Washington State schools participated in the state’s Smarter Balanced Assessment pilot in the
2013-14 school year, so test scores are not available in 2013-14 for students in these schools. We discuss our
approach to these missing data in the analytic approach section.

17 CEDARS data include fields designed to link students to their individual teachers, based on reported schedules.
However, limitations of reporting standards and practices across the state may result in ambiguities or inaccuracies
around these links. We limit the student sample to students who received instruction from a single teacher in that
subject and year.

18 Note that the quintiles in this table are based on edTPA scores across multiple test types; but all models include
fixed effects for test type (so candidates are compared only with other candidates who took the same test type).

19 The correlations between continuous edTPA scores and the three WEST-B subtests are moderate (r = 0.20 in
mathematics and reading, r = 0.25 in writing).
Washington State's public schools in the subsequent year: Only 50.8% of first-quintile (lowest-quintile)
teachers are observed teaching versus 64.6% of fifth-quintile (top-quintile) teachers. We also observe
large differences in performance between Hispanic and non-Hispanic White teacher candidates.
Specifically, Hispanic candidates are about twice as likely to score in the lowest quintile of the edTPA as
in the middle three quintiles and four times as likely to score in the lowest quintile as in the top 
quintile. 20 

We further explore the differences in edTPA performance by teacher candidate race/ethnicity in 
Table 2. Hispanic teacher candidates score significantly lower than non-Hispanic White candidates on
the total score, all three subscores, and all fifteen individual rubrics. 21 Additionally, Hispanic candidates 
in Washington were more than three times more likely to fail the edTPA after it became consequential 
in the state than non-Hispanic White candidates: 13.7% of Hispanic candidates failed the test after 
January 2014, compared to 3.7% of non-Hispanic White candidates. 22 Although this difference in passing 
rates suggests that the high-stakes use of the edTPA in Washington may adversely impact the state's
teacher workforce diversity, we do not find that first-year teachers in the 2014-15 school year (the
year after the edTPA became consequential) are less diverse than in earlier years; in fact, 7.39% of all
first-year teachers in 2014-15 are Hispanic, compared to 4.47% in 2013-14. It is also unclear whether the
high-stakes use of the edTPA elicits other behavioral changes that affect who pursues a career as a


20 This is consistent with research showing that performance on licensure tests varies across teacher candidate
subgroups (Goldhaber and Hansen, 2010).

21 These results are robust to controlling for candidate TEP (i.e., Hispanic candidates are more likely to fail the
edTPA than non-Hispanic White candidates within the same TEP), and conflict with recent evidence (edTPA, 2016)
from a national census of edTPA test-takers that finds Black teacher candidate scores to be significantly lower than
the scores of White candidates, but no significant difference between White and Hispanic teacher candidate edTPA
performance.

22 Hispanic teacher candidates are also considerably more likely than White candidates to score lower than a 40 (the
state’s future cut score), though we cannot necessarily conclude that this difference in passing rates would hold
under this new cut score.
teacher, or whether Hispanic teachers may be more likely to receive emergency credentials to teach in
high-needs areas such as English Language Learner programs.
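The "more than three times" comparison follows directly from the two failure rates reported in the text. A short Python check of the arithmetic, with a hypothetical helper function, is:

```python
def relative_risk(rate_group, rate_reference):
    """Ratio of the failure rate in one group to the rate in a reference group."""
    if rate_reference <= 0:
        raise ValueError("reference rate must be positive")
    return rate_group / rate_reference


# Failure rates after January 2014, as reported in the text.
hispanic_fail_rate = 0.137
white_fail_rate = 0.037

ratio = relative_risk(hispanic_fail_rate, white_fail_rate)  # roughly 3.7
gap_in_points = (hispanic_fail_rate - white_fail_rate) * 100  # roughly 10 percentage points
```

The ratio describes relative failure rates only; it says nothing on its own about how many candidates are affected, which depends on the number of test takers in each group.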

2. Analytic Approach 

To investigate the relationship between edTPA scores and the probability of workforce entry, we 
first define $p_{jkt}$ as the probability that teacher candidate $j$ who took edTPA test type $k$ in 2013-14
appears as a Washington State public school teacher in the 2014-15 school year and estimate a simple
logit model for all 2,238 teacher candidates in the sample:

$$\log\left(\frac{p_{jkt}}{1 - p_{jkt}}\right) = \alpha_0 + \alpha_1 TPA_{jk} + \alpha_k + \varepsilon_{jk} \tag{1}$$

In the base specification of the model in equation 1, $TPA_{jk}$ is a binary variable indicating whether
teacher candidate $j$ passed the edTPA on the first test sitting. Given that all specifications include fixed
effects for test type $k$, all coefficients can be interpreted as relative to other teacher candidates who
took the same test type. 23 Although the coefficient of interest $\alpha_1$ is on the log odds scale, we present all
estimates as average marginal effects. We also estimate three other specifications of the model in
equation 1 in which: (1) $TPA_{jk}$ is an indicator for whether candidate $j$ would have passed the edTPA at the
state's future (and higher) cut score; (2) $TPA_{jk}$ is a continuous variable indicating the edTPA score of
candidate $j$ (standardized relative to all test takers); and (3) $TPA_{jk}$ is a vector of scores for candidate $j$
across the three subscores on the test (each standardized relative to all test takers).
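To make the link between the log odds coefficient and an average marginal effect concrete, the following Python sketch computes the average change in predicted employment probability when the pass indicator switches from 0 to 1, holding each candidate's test type fixed. The coefficient values, the test type labels, and the function names are hypothetical, and treating the marginal effect of a binary variable as an average discrete change is one common convention that the paper does not explicitly confirm.

```python
import math


def logistic(z):
    """Inverse of the log odds transformation."""
    return 1.0 / (1.0 + math.exp(-z))


def average_marginal_effect(alpha0, alpha1, test_type_effects, candidate_test_types):
    """Average discrete change in P(employed) from failing to passing.

    alpha0, alpha1: hypothetical intercept and pass coefficient (log odds scale).
    test_type_effects: dict mapping test type -> fixed effect (log odds scale).
    candidate_test_types: one test type label per candidate in the sample.
    """
    changes = []
    for test_type in candidate_test_types:
        base = alpha0 + test_type_effects[test_type]
        changes.append(logistic(base + alpha1) - logistic(base))
    return sum(changes) / len(changes)


# Hypothetical inputs for illustration only.
effects = {"elementary": 0.0, "secondary_math": 0.4}
sample = ["elementary", "elementary", "secondary_math"]
ame = average_marginal_effect(alpha0=0.1, alpha1=0.5, test_type_effects=effects,
                              candidate_test_types=sample)
```

Because the logistic function is nonlinear, the same coefficient implies a different probability change for each candidate, which is why the effect is averaged over the sample rather than evaluated at a single point.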

To investigate the predictive validity of the edTPA in terms of predicting the achievement of a 
teacher candidate's future students, we estimate value-added models (VAMs) intended to separate the
impact of teacher characteristics (such as edTPA scores) from other variables that influence student test
performance (see Koedel et al. [2015] for review). Specifically, we estimate variants of the following

23 As discussed in Section II, there are 27 different versions of the edTPA, so this ensures that candidates are only
compared to other candidates who completed the same test type.
VAM only for the candidates who enter the teaching workforce and are linked to current and lagged
student achievement data (204 in reading, 206 in mathematics):

$$Y_{ijgkst} = \beta_0 + \beta_1 Y_{i,t-1} + \beta_2 X_{it} + \beta_3 C_{ist} + \beta_4 Z_{jt} + \beta_5 TPA_{jk} + \beta_g + \beta_k + \varepsilon_{ijkgst} \tag{2}$$

In equation 2, $Y_{ijgkst}$ is the SBA score of student $i$ in grade $g$, subject $s$, and year $t$ (the 2014-15
school year for all students), while in the classroom of teacher $j$ who took edTPA test type $k$. $Y_{i,t-1}$ is a
vector of student $i$'s prior year test scores in mathematics and reading. The student test scores in both
$Y_{ijgkst}$ and $Y_{i,t-1}$ are standardized by test, grade, and year across all test takers. Therefore, the units of
the coefficients on the right side of equation 2 are standard deviations of student performance (relative
to other scores on the same test in the same grade and year). $X_{it}$ is a vector of student covariates for
student $i$ in year $t$, which includes indicators for race/ethnicity, gender, free or reduced-price lunch
eligibility, gifted/highly capable, limited English proficiency (LEP), special education, and learning
disabled. $C_{ist}$ is a vector of aggregated student characteristics in the student's classroom, while $Z_{jt}$ is an
indicator for whether or not a teacher possesses an advanced degree in year $t$. 24 All specifications
include fixed effects for grade g and test type k, so all results can be interpreted as relative to other 
students in the same grade whose teachers took the same edTPA test type. 

The different specifications of the model in equation 2 correspond to the different theories of 
action discussed in the introduction. When we investigate the edTPA as a screening mechanism 
intended to prevent low-performing teachers from entering the workforce, $TPA_{jk}$ is an indicator for
whether candidate $j$ passed the edTPA on the first test administration (or, in a related specification,
would have passed the edTPA at the state's future cut score). When we investigate the signal value of
edTPA scores (i.e., the extent to which a candidate's score could be used as a proxy for future teaching

24 Note that we do not need to control for teaching experience because every teacher in the VAM sample is a
first-year teacher.
effectiveness), $TPA_{jk}$ is the standardized edTPA score of candidate $j$ (or, in a separate specification, a
vector of standardized scores for candidate $j$ across the test's three subscores).

We estimate specifications with only test type fixed effects (the most parsimonious model in 
which teachers are compared to other teachers who took the same test type), with test type and TEP 
fixed effects (in which teachers are compared to other teachers who took the same test type and 
graduated from the same TEP), and with test type and school district fixed effects (in which teachers are
compared to other teachers who took the same test type and are teaching in the same school district). 25
We estimate equation 2 by ordinary least squares (OLS) and cluster standard errors at the teacher level
to account for correlation between the errors of students taught by the same teacher.
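
For reference, a standard textbook form of the cluster-robust variance estimator for OLS with teacher-level clusters is given below. The notation is ours rather than the paper's: $W$ stacks all right-hand-side variables of equation 2, $W_j$ and $\hat{u}_j$ collect the rows and OLS residuals for the students of teacher $j$, and $J$ is the number of teachers. Any finite-sample correction applied by the authors' software is not stated in the text and is omitted here.

```latex
\widehat{V}\left(\hat{\beta}\right)
  = \left(W'W\right)^{-1}
    \left( \sum_{j=1}^{J} W_j' \, \hat{u}_j \hat{u}_j' \, W_j \right)
    \left(W'W\right)^{-1}
```

The middle term allows arbitrary correlation among the errors of students who share a teacher while treating errors as uncorrelated across teachers.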

One challenge in estimating all of these specifications is that approximately one-third of 
students in Grades 4-8 have missing prior-year test scores because their school participated in
Washington State's Smarter Balanced Assessment pilot in the 2013-14 school year (and the state did
not collect their scores). We therefore estimate three types of models: (1) a listwise deletion model that
drops all students with missing prior-year test scores (possible in Grades 4-8); (2) an imputation model
that uses twice-lagged test scores to impute lagged test scores for students with missing test scores 
(possible in Grades 5-8); and (3) a stacked model that considers any student with either once-lagged 
scores, twice-lagged scores, or both and uses missing-value dummies to account for missing data 
(possible in Grades 4-8). We present primary results from the stacked models because they are based 
on the largest sample sizes, but estimates from the other models show that the results are not sensitive 
to these sample considerations. 26 
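
The missing-value dummy approach in the stacked model can be sketched as follows. This is a minimal illustration, not the authors' code: the field names (`lag1_score`, `lag2_score`), the function name, and the choice to fill missing scores with zero are our assumptions.

```python
def add_missing_indicators(students, score_keys=("lag1_score", "lag2_score")):
    """Zero-fill missing lagged scores and flag them with 0/1 indicators.

    Students with no observed lagged score at all are dropped, since they
    contribute no prior-achievement information.
    """
    prepared = []
    for student in students:
        row = dict(student)
        n_observed = 0
        for key in score_keys:
            missing = row.get(key) is None
            row[key + "_missing"] = int(missing)
            if missing:
                row[key] = 0.0  # placeholder; the indicator absorbs the level shift
            else:
                n_observed += 1
        if n_observed > 0:
            prepared.append(row)
    return prepared


# Hypothetical students with standardized prior scores.
students = [
    {"id": 1, "lag1_score": 0.3, "lag2_score": 0.1},    # both lags observed
    {"id": 2, "lag1_score": None, "lag2_score": -0.4},  # pilot school: once-lagged score missing
    {"id": 3, "lag1_score": None, "lag2_score": None},  # no prior scores: excluded
]
stacked = add_missing_indicators(students)
```

Including both the zero-filled score and its indicator as regressors lets students with only one observed lag remain in the estimation sample, which is why the stacked models have the largest sample sizes.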


25 We also experiment with school fixed effects models, but relatively few teachers in the VAM
sample teach in the same school as other teachers who took the edTPA.

26 These results are provided in Appendix Tables A2-A5.

The broader VAM literature (e.g., Chetty et al., 2014; Kane et al., 2013) suggests that the VAMs
described above account for the potential non-random sorting of students to teachers in the sample. A
second concern, however, is the potential for sample selection bias. As is the case with other licensure
tests, sample selection is a concern if teacher characteristics not captured by the edTPA are relevant for
hiring decisions and contribute to teacher effectiveness. The literature on teacher hiring suggests that
this is likely to be the case. For example, administrative and survey evidence suggests that references,
interviews, and personality traits are important predictors of employment outcomes, and that several of
these measures are related to student achievement (Goldhaber et al., 2014a; Harris and Sass, 2014;
Jacob et al., 2016; Rockoff et al., 2011). Consequently, teachers who perform poorly in the domains
measured by the edTPA but who appear in our sample are likely hired because they possessed some
compensating skill or skills that make them more effective teachers. In other words, the candidates we
observe with low scores are probably disproportionately high-performing teachers.

We explore this issue empirically in Section IV below, but we argue that two factors are likely to
limit the selection bias in our application. First, we examine the edTPA at a time when it was not fully
binding in Washington State. Given the lower cut score and the ability of failing teacher candidates to
retake the assessment, the difference in selection probabilities between initially passing and initially
failing candidates is not as substantial as it would be if the testing requirement were fully binding.

Second, while non-tested teacher skills appear related both to hiring decisions and to teacher 
effectiveness, this relationship is not particularly strong. For example, analyses of the kinds of subjective 
data available to hiring authorities suggest that, when combined with observable and objective
measures of teacher skill, these measures explain only 10% to 20% of the variation in teacher
effectiveness (Goldhaber et al., 2014a; Jacob et al., 2016; Rockoff et al., 2011). Results from Jacob et al.
(2016) suggest a similar relationship to the probability that a candidate for a position is hired.

IV. Results

In this section, we describe our primary research findings on the extent to which edTPA scores
predict the likelihood of being in the Washington State public teacher workforce (Table 3 and Figure 2),
teacher effectiveness in reading (Table 4 and Figure 3), and teacher effectiveness in mathematics (Table
5 and Figure 4). Before discussing our primary findings, however, a few peripheral findings are worth
brief mention. 27 In terms of predicting employment in the Washington State teacher labor market, we
find both that individual TEPs are associated with different probabilities of employment and that
candidates who took the edTPA in a STEM area are more likely to be employed than are candidates who
took an elementary edTPA assessment. Both findings echo earlier results from Goldhaber et al. (2014b).

When estimating student achievement models, we find that underrepresented minority 
students (black and Hispanic), participants in the free and reduced-price lunch program, and students 
with reported learning disabilities score lower than their reference groups, all else equal. The 
magnitudes of these findings are quite similar to what has previously been found in Washington State 
(e.g., Goldhaber et al., 2013) and other states (e.g., Rivkin et al., 2005). Similar to the employment
models, TEPs explain a significant portion of student achievement gains in both mathematics and
reading. This finding is similar to evidence from Washington State and other states in terms of the 
variation in teacher effectiveness that can be attributed to TEPs (Goldhaber et al., 2013; Boyd et al., 
2009; Mihaly et al., 2013). 

1. edTPA as Predictor of Workforce Entry 

Table 3 reports several specifications of models predicting the likelihood of being employed in
the Washington State public school teacher labor market the year after a candidate takes the edTPA
assessment (see equation 1 above). All coefficients are reported as average marginal effects, so the
estimate in column 1, for example, means that teacher candidates who passed the edTPA at the
Washington State cut score are 15.2 percentage points more likely to enter the public teaching
workforce than are teacher candidates who failed the edTPA at the Washington State cut score, all else
equal (i.e., compared with other candidates who took the same test type). The estimated marginal
effect is somewhat smaller when candidates are compared with other candidates from the same TEP
(column 2), and when we consider candidates who would have passed the test at the future Washington
State cut score (columns 3 and 4). These relationships are not surprising given that passing the edTPA is
a licensure requirement for some candidates in our sample, and they are
even stronger when we restrict the sample only to teacher candidates who took the edTPA after it
became consequential. 28

27 The coefficients we discuss are not reported in the tables but are available from the authors upon request.

Columns 5-8 consider continuous measures of edTPA performance as predictors of workforce 
entry. These continuous scores are standardized across all test takers, so the average marginal effect in 
column 5 means that a one standard deviation increase in a candidate's edTPA score is associated with a
5.9 percentage point increase in the probability that an average teacher candidate is employed in the 
teacher workforce the following year. Columns 7 and 8 report specifications in which the three 
subscores of the edTPA are separately included in the model and show that the positive relationship 
between the total score and the likelihood of being in the labor market is driven largely by the 
assessment and instruction subscores. When we consider quintiles of edTPA scores, we find that scoring 
in the top quintile of the edTPA is associated with a 14 percentage point increase in the probability that 
a candidate will be employed in the following year, as compared with a candidate who scored in the 
bottom quintile. 
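
To make the interpretation of an average marginal effect concrete, the sketch below computes one for a binary regressor (such as a pass indicator or a top-quintile indicator) under a logit link. The logit link is an assumption on our part, since the functional form of equation 1 is not restated in this section, and the index values and coefficient are made-up numbers, not estimates from Table 3.

```python
import math


def logistic(z):
    """Logistic CDF, mapping a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def average_marginal_effect(linear_indexes, indicator_coef):
    """Average change in predicted probability when a 0/1 regressor switches on.

    `linear_indexes` holds each candidate's linear index excluding the
    indicator's own term; `indicator_coef` is the indicator's logit coefficient.
    """
    diffs = [logistic(xb + indicator_coef) - logistic(xb) for xb in linear_indexes]
    return sum(diffs) / len(diffs)


# Hypothetical inputs for three candidates.
indexes = [-0.5, 0.0, 0.4]
ame = average_marginal_effect(indexes, indicator_coef=1.0)
```

The result is on the probability scale, so multiplying it by 100 gives a difference in percentage points, which is how the estimates in Table 3 are described.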


28 These results are reported in Table A1 in the appendix.

To help visualize the relationship between edTPA scores and the probability of teaching
employment, Figure 2 plots the observed probability of employment associated with each edTPA score,
along with a polynomial best-fit line. 29 Two patterns are worth noting. First, the relationship between
edTPA scores and probability of employment is relatively steep and linear in the lower range of edTPA
scores, with no discontinuity at the current passing score of 35, suggesting that, at least at the lower
end of the distribution, continuous edTPA scores reflect some candidate trait or traits that are predictive
of employment. Second, the relationship is much weaker in the upper range of the distribution of edTPA
scores, which means that the probabilities of employment are similar for candidates within the range of
relatively high edTPA scores.

Although the results in Table 3 and Figure 2 demonstrate a strong relationship between edTPA 
scores and the probability that a teacher candidate is employed in Washington State's K-12 public 
teaching workforce, it is not possible to disentangle preferences of teacher candidates and employers in 
interpreting these findings. As noted above, some districts may use edTPA to help them decide among 
teacher applicants. On the teacher candidate side, moreover, these findings may reflect the fact that 
more dedicated teacher candidates perform better on the assessment and are also more likely to enter 
the profession. 

2. edTPA as a Screening Mechanism 

Columns 1-6 of Tables 4 and 5 summarize the relationship between passing the edTPA (either at
the current Washington State cut score of 35 or future cut score of 40) and teacher effectiveness in
reading or mathematics, respectively. We estimate these screening models using data from the
classrooms of teachers employed in the year following their edTPA administration. Given that many
teacher candidates do not find teaching positions and that only a minority of teachers work in tested
grades and subjects, this is a necessarily small subset of the total number of teacher candidates sitting
for the edTPA. We may therefore worry that such selection biases our results. The concern is that
teachers who perform poorly on the edTPA but still obtain teaching positions likely have other skills that
are valued in the workplace but are not observed in our data, suggesting that the coefficients reflecting
the relationship between edTPA performance and teacher effectiveness are biased downward; i.e., they are a
lower bound on the true relationship. As discussed in the previous section, there are good reasons to
believe that sample selection bias is a minimal concern, but this motivates the bounding exercise
described later in this section.

29 The best-fit line is estimated from a logit at the teacher candidate level, with the order of the polynomial chosen to
minimize the AIC of the regression.

The models in Tables 4 and 5 correspond to equation 2, and include lagged test scores and other
student background controls (the specific independent variables used in each model specification are
reported in notes below the table), but they exclude other teacher candidate variables as we are
focused only on assessing the pass/fail screening value of the edTPA assessment. However, the
coefficients in Tables 4 and 5 change very little when the models include additional teacher controls
(such as WEST-B scores). We also note that results are very consistent between the primary
specifications reported in Tables 4 and 5 and the more conservative specifications that either only use
students with non-missing prior-year test scores or non-missing twice-lagged test scores. 30

30 These results are reported in Tables A2-A5 in the appendix.

Column 1 of Table 4 demonstrates that teacher candidates who pass the edTPA at the
Washington State cut score are more effective in reading instruction, all else equal, than teacher
candidates who fail the edTPA on their first test administration. Specifically, students assigned to a
teacher who passed the edTPA score 0.252 standard deviations higher, all else equal, than students
assigned to a teacher who failed the edTPA. This relationship is large and statistically significant in all
specifications, whether candidates are compared to other candidates from the same TEP (column 2) or to
candidates who teach in the same school district (column 3), and is more modest but still statistically
significant when we consider whether candidates would have passed the test at the future Washington
State cut score. We interpret
these results as suggesting that the edTPA has strong predictive validity in reading as a screen at these
cut points. Our point estimates for the edTPA screening effect in mathematics in columns 1-6 of Table 5,
on the other hand, are smaller and generally statistically insignificant. Although positive in all
specifications, the screening coefficient in mathematics is statistically significant in only one
specification (column 5). 31

The differences between the screening coefficients in reading and the corresponding
coefficients in math are statistically significant, and these differences are reflected in Figures 3 and 4,
which plot estimated teacher value added and edTPA test scores for all teachers in our sample. The lines
plotted in these figures show local linear estimates of the relationship between teacher value added and
edTPA test scores. 32 While these figures do not control for candidate test type (and thus candidates are
being compared to all other candidates regardless of test type), they illustrate that candidates who fail
the edTPA at the current Washington State cutoff (35) and future Washington State cutoff (40) tend to
be considerably less effective in reading (Figure 3), but less so in mathematics (Figure 4). The predicted
effectiveness in reading increases sharply before the cut points, but predicted effectiveness in
mathematics changes relatively little in this same range. As demonstrated by the scatter plot, we
observe a smaller number of teachers with failing scores in the reading sample than in the mathematics
sample and these teachers are more likely to have low value added. 33
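
A local linear fit of the kind plotted in Figures 3 and 4 can be sketched in a few lines. The version below is a simplified stand-in, not the np-based procedure the paper cites: it uses a Gaussian kernel with a hand-picked bandwidth, and the function name and toy data are our assumptions.

```python
import math


def local_linear_fit(x, y, x0, bandwidth):
    """Local linear estimate of E[y | x = x0] using Gaussian kernel weights."""
    w = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    sw = sum(w)
    # Kernel-weighted means of x and y around the evaluation point.
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Kernel-weighted least-squares slope.
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    # Value of the weighted least-squares line at x0.
    return ybar + slope * (x0 - xbar)


# Toy data lying exactly on a line (not estimates from the paper).
scores = [30, 35, 40, 45, 50, 55]
value_added = [0.01 * s - 0.5 for s in scores]
fitted_at_42 = local_linear_fit(scores, value_added, x0=42, bandwidth=5.0)
```

Evaluating the function over a grid of edTPA scores traces out a curve; a smaller bandwidth follows the data more closely, while a larger one approaches a single straight line.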


31 As shown in Tables A6 and A7 in the appendix, the screening results are similar when we estimate models only
for candidates who took the edTPA after it became consequential. The differences between the screening results
before and after the edTPA became consequential are not statistically significant.

32 We estimate teacher value added using the same specification as equation 2, but omitting the edTPA scores and
teacher controls. We then estimate local linear regressions of estimated teacher value added on edTPA scores using
the np package in R (Hayfield & Racine, 2008).

33 In order to obtain an estimate of the potential magnitude of sample selection bias in these estimates, we conduct a
bounding exercise in the spirit of Lee (2008). Our results suggest the point estimates for the screening effect lie
between 0.05 and 0.40 for reading and between -0.09 and 0.09 in mathematics. Results available from the authors
upon request.

3. The Signal Value of edTPA Performance

The value of the edTPA as a signal of teacher quality is an important policy issue. Recall that the
edTPA is described as an "educative assessment," and this is much more plausible if there is predictive
validity to the assessment away from the cut point (suggesting that changes in performance by
candidates or institutions are indeed predictive of teacher effectiveness). Additionally, whether inservice
teachers with higher edTPA scores are more effective is an important policy question given that school
systems may wish to consider an applicant's edTPA scores in making hiring decisions.

Columns 7-12 of Tables 4 and 5 report the estimated relationships between continuous
measures of candidate edTPA performance and student achievement in reading and mathematics,
respectively. Columns 7-9 of Table 4 illustrate that we find little evidence that edTPA scores throughout
the distribution are predictive of teacher effectiveness in reading. Specifically, the coefficient in column
7 means that a one standard deviation increase in a candidate's edTPA score is correlated with a 0.02
standard deviation increase in student performance in the candidate's classroom in his or her first year
teaching, but this relationship is not statistically significant. The weak relationship between continuous
edTPA scores and teacher effectiveness in reading is reflected in Figure 3, as there is little increase in
predicted teacher effectiveness within the range of passing scores (i.e., above a 40). We note, however,
that this relationship is positive and statistically significant when we focus solely on candidates who took
the edTPA after it became consequential in January 2014. 34

On the other hand, columns 7-12 of Table 5 provide some evidence that edTPA scores provide a
signal of future teacher effectiveness in mathematics. 35 Specifically, when candidates are compared
across TEPs and districts (column 7), a one standard deviation increase in a candidate's edTPA score is
correlated with a 0.03 standard deviation increase in student performance in the candidate's classroom
in his or her first year teaching, and this relationship is marginally statistically significant. This is reflected
in the generally positive slope of the local-linear fit line in Figure 4.

34 See Table A7 in the appendix. The difference between this relationship before and after the edTPA became
consequential is not statistically significant.

35 Note, however, that the differences between the signal coefficients in math and the corresponding coefficients in
reading are not statistically significant.

The relationship between edTPA scores and mathematics teaching effectiveness is stronger 
when candidates are compared to other candidates from the same TEP (column 8), but weaker when 
candidates are compared to other candidates who are teaching in the same school district (column 9). 

As discussed in Goldhaber et al. (2013), it is possible that the district fixed effects in the model in column
9 capture district-level effects that are attributed to teachers in the estimates reported in columns 7 and
8; but it is also possible that these effects remove average differences in teacher quality among different
school districts that should be attributed to teachers. Given that we cannot distinguish between these
possibilities, we simply conclude that the predictive validity of the edTPA as a signal of future teaching
effectiveness in mathematics is stronger when comparisons are made across districts rather than within
districts.

Finally, columns 10-12 of Table 5 consider the three edTPA subscores as joint predictors of
teacher effectiveness in mathematics, and suggest that candidate performance on the Planning rubrics
is driving the relationships in columns 7-9. This is an interesting finding, as the Planning subscores
were less predictive of the probability of employment than were the other two subscores (see Table 3).

V. Policy Implications 

In this study, we find that teachers failing the edTPA under the future Washington State passing
threshold have lower value added in reading than teachers who passed the test at this cut score. We
find no statistically significant difference between those who pass and those who fail in mathematics,
although changes in the assessment score are predictive of teacher performance. These results
generally hold when a licensure test of candidates' basic skills is included in the model, which suggests
that portfolio-based assessments such as the edTPA contain information about teaching practice that is
not captured by these basic-skills tests. Although our point estimates are imprecise due to
the small samples employed in this study, the magnitudes of the signal estimates are roughly similar to
those observed in studies of other licensure tests (Goldhaber, 2007; Clotfelter et al., 2007;
Darling-Hammond et al., 2013).

In order to put the results in perspective, we estimate the probability that a teacher candidate
failing the edTPA is a low-performing teacher (defined as being in the bottom 20% of value added) or a
high-performing teacher (defined as being in the top 20% of value added). 36 The results of this test are in
Table 6. If passing the edTPA provided no predictive power for value added, we would expect 20% of
teachers who fail the test to be in each of these categories. Not surprisingly given the null screening
results in mathematics, we find that 19% of mathematics teachers who fail the edTPA are in the
low-performing category. On the other hand, we find that 46% of reading teachers who fail the edTPA are in
the low-performing category, far higher than the 20% we would expect by chance alone. That said, if the
edTPA really were used as a one-time, high-stakes test for employment eligibility, screening out these
candidates who would become ineffective teachers comes at the cost of screening out some candidates
who would become effective teachers. Specifically, 8% of reading teachers and 14% of math teachers
who fail the edTPA are in the high-performing category (top 20% of value added); neither of these
proportions is statistically different from the 20% we would expect by chance.
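
The comparison against the 20% chance benchmark can be illustrated with a small sketch that computes, among teachers who failed, the share falling in the bottom and top quintiles of value added. The data and function name are hypothetical, and the quintile cutoffs are computed in the simplest possible way (sample size divisible by five, no ties); this is not the procedure behind Table 6.

```python
def quintile_shares_among_failers(teachers):
    """Share of failing teachers with value added in the bottom / top 20%.

    `teachers` is a list of (value_added, passed) pairs.
    """
    ranked = sorted(va for va, _ in teachers)
    n = len(ranked)
    low_cut = ranked[n // 5 - 1]    # largest value inside the bottom 20%
    high_cut = ranked[n - n // 5]   # smallest value inside the top 20%
    failers = [va for va, passed in teachers if not passed]
    low_share = sum(va <= low_cut for va in failers) / len(failers)
    high_share = sum(va >= high_cut for va in failers) / len(failers)
    return low_share, high_share


# Ten hypothetical teachers; four of them failed the assessment.
teachers = [
    (-0.4, False), (-0.3, True), (-0.2, False), (-0.1, True), (0.0, True),
    (0.1, False), (0.2, True), (0.3, True), (0.4, True), (0.5, False),
]
low_share, high_share = quintile_shares_among_failers(teachers)
```

If failing carried no information about value added, both shares would be close to 0.20 in a large sample; shares well above 0.20 in the bottom quintile indicate that the screen is picking up low-performing teachers.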

36 We obtain similar results if we instead estimate these conditional probabilities using the simulation method
suggested by Jacob and Lefgren (2008).

We can more simply summarize these proportions using the "number needed to treat." In
medicine, the number needed to treat is the average number of patients that would need to be assigned
an intervention to avoid one additional adverse outcome. A low number needed to treat indicates an
efficient intervention as it implies that a greater number of patients benefit. In this case, we can identify
the number of test-takers needed to screen out a lowest quintile teacher. We do a back-of-the-envelope
calculation suggesting that the edTPA identifies one bottom quintile reading teacher for every 17
assessed candidates, while it identifies one bottom quintile mathematics teacher for every 39
candidates. Put another way, this suggests a cost in exam fees to candidates of $5,100 to identify an
ineffective reading teacher and $11,700 to identify an ineffective mathematics teacher.
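
The cost figures follow directly from multiplying the number of candidates assessed per identified teacher by the exam fee. The short sketch below reproduces that arithmetic using the $300 per-candidate fee cited in the paper; the function name is ours.

```python
EXAM_FEE = 300  # dollars per teacher candidate, as cited in the paper


def cost_per_identified_teacher(candidates_per_identification, fee=EXAM_FEE):
    """Exam fees paid, in total, for each bottom-quintile teacher identified."""
    return candidates_per_identification * fee


reading_cost = cost_per_identified_teacher(17)  # one reading teacher per 17 candidates
math_cost = cost_per_identified_teacher(39)     # one mathematics teacher per 39 candidates
```

These amounts count exam fees only and leave out the time costs to candidates and programs.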

VI. Conclusion 

Given that this is the first predictive validity study of the edTPA, and given the nuanced findings
we describe above, we are hesitant to draw broad conclusions about the extent to which edTPA
implementation will improve the quality of the teacher workforce. Instead, we relate our findings back
to the different theories of action for how the edTPA might improve teacher workforce quality, but we
stress that even these conclusions come with important caveats and trade-offs that policymakers and
teacher educators should weigh as they interpret these results.

The first theory of action is that the edTPA can be used as a screen to prevent ineffective
teacher candidates from entering the workforce. The screening results in reading, which demonstrate
predictive validity around the current and future Washington State cut points used for licensing
decisions, generally suggest that this theory of action is promising in terms of improving overall
workforce quality in reading. But as we discuss in the previous section, this screening comes at a cost, as
candidates who fail the edTPA but become high-performing teachers will also be screened out of the
workforce. We do not find evidence of a screening effect in mathematics, although our estimates are
imprecise. This pattern may, in part, be caused by the edTPA's focus on candidates'
writing capacities, which may be more related to a teacher's ability to teach reading than
mathematics. 37 It is also important to recognize that the screening theory of action is predicated on
teacher candidates failing the assessment. It is unclear that this screening theory of action can actually
work in a setting in which candidates are able to take the test multiple times in order to pass, as the
ability of the assessment to predict teacher effectiveness is likely to be low for candidates with multiple
retakes (Cowan and Goldhaber, 2016).

The second theory of action is that the edTPA could improve the quality of all teaching 
candidates through the experience of the assessment or programmatic changes that are related to 
information TEPs receive about teacher candidate performance. This is much more likely if the edTPA 
scores can serve as a signal of quality teaching beyond just at the cut point required to participate in the 
labor market. In this case, it is the modest but statistically significant results in mathematics that suggest
promise for this theory of action and the weaker results in reading that suggest caution. That said, the
extent to which the edTPA can "support candidate learning and preparation program renewal" (edTPA,
2015) likely depends on the ability of TEPs to create feedback loops that allow candidate performance
on the edTPA to influence the training they provide. Moreover, policymakers and teacher educators also
need to weigh these results against the possibility that the high-stakes use of the edTPA may adversely
affect the diversity of the teacher workforce, given the large differences between the passing rates of
White and Hispanic teacher candidates in Washington.

37 For example, edTPA scores are more highly correlated with WEST-B writing scores (r=0.25) than WEST-B
reading or mathematics scores (r=0.20).

We believe there are a number of potential next steps that are not possible to pursue with the
data used in this study but that would be valuable to policymakers and teacher educators. One is to
investigate the degree to which the different rubric scores within the edTPA might be reweighted (or
modified) in order to increase the relationship between summative edTPA scores and student
achievement or teacher value added. The samples in Washington State are currently insufficient for
optimal weighting exercises (e.g., Goldhaber et al., 2014a), but such exercises are possible with
additional years of data and/or data from other states and would be valuable to TEPs looking to
prioritize different aspects of their training of teacher candidates. A second next step might be to assess
how edTPA scores are related to other, broader measures of teacher performance, such as
observational ratings. This is not currently possible using Washington State's administrative data; but it
may be possible elsewhere. Finally, given concerns about the fairness of teacher observations across
classroom contexts (Steinberg and Garrett, 2016) and recent calls to place more student teachers in
disadvantaged schools (Krieg et al., 2016), policymakers would benefit from evidence about whether
edTPA scores vary substantially across teacher candidates in different kinds of student teaching
positions.

A final caveat to these conclusions, and an essential issue for policymakers and teacher
educators to weigh in interpreting these results, is whether the results we reference above justify the
investments that candidates, states, and TEPs have made in the edTPA. While the monetary costs
associated with the edTPA are easily quantifiable (e.g., $300 per teacher candidate), there are also less
easily quantifiable time-commitment costs for both candidates and programs. We know very little
regarding whether these costs might affect the pool of people who seek to become teachers. We
therefore view the interpretation of these results as very much in the eye of the beholder, and we hope
this early analysis spurs an evidence-based discussion about the potential promise and drawbacks of
edTPA implementation.

Tables

Table 1. Summary Statistics

| | Teacher candidate sample | | | | | | Teacher sample | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| | All | Q1 | Q2 | Q3 | Q4 | Q5 | All | Q1 | Q2 | Q3 | Q4 | Q5 |
| Total score | 46.339 | 37.272 | 44.100 | 47.045 | 50.398 | 56.204 | 46.960 | 37.537 | 44.122 | 47.035 | 50.368 | 56.240 |
| | (6.960) | (4.535) | (0.804) | (0.813) | (1.133) | (3.069) | (6.726) | (4.398) | (0.797) | (0.814) | (1.124) | (3.003) |
| Planning subscore | 15.924 | 12.962 | 15.218 | 16.138 | 17.284 | 19.102 | 16.102 | 13.088 | 15.135 | 16.095 | 17.243 | 19.113 |
| | (2.644) | (2.216) | (1.234) | (1.256) | (1.410) | (1.624) | (2.584) | (2.182) | (1.297) | (1.304) | (1.458) | (1.628) |
| Instruction subscore | 15.460 | 12.869 | 14.783 | 15.515 | 16.473 | 18.623 | 15.661 | 12.974 | 14.857 | 15.574 | 16.465 | 18.579 |
| | (2.416) | (1.771) | (1.203) | (1.121) | (1.389) | (1.678) | (2.334) | (1.733) | (1.133) | (1.132) | (1.395) | (1.642) |
| Assessment subscore | 14.922 | 11.395 | 14.055 | 15.361 | 16.605 | 18.473 | 15.165 | 11.432 | 14.094 | 15.327 | 16.627 | 18.542 |
| | (2.948) | (2.317) | (1.293) | (1.286) | (1.291) | (1.638) | (2.896) | (2.317) | (1.317) | (1.350) | (1.287) | (1.603) |
| % Passing WA score (35) | 0.939 | 0.743 | 1.000 | 1.000 | 1.000 | 1.000 | 0.951 | 0.759 | 1.000 | 1.000 | 1.000 | 1.000 |
| % Passing future score (40) | 0.865 | 0.436 | 1.000 | 1.000 | 1.000 | 1.000 | 0.889 | 0.456 | 1.000 | 1.000 | 1.000 | 1.000 |
| WEST-B Reading | 271.016 | 266.396 | 271.200 | 271.656 | 272.459 | 274.659 | 271.482 | 265.681 | 271.269 | 271.901 | 273.364 | 275.667 |
| | (16.139) | (17.522) | (16.709) | (15.307) | (14.961) | (14.221) | (15.997) | (17.806) | (17.118) | (14.711) | (14.623) | (13.276) |
| WEST-B Writing | 264.340 | 257.273 | 264.049 | 265.408 | 267.332 | 269.685 | 264.910 | 256.710 | 264.166 | 266.214 | 267.905 | 270.266 |
| | (18.049) | (19.124) | (17.832) | (17.259) | (15.523) | (17.224) | (17.550) | (18.740) | (17.113) | (16.468) | (15.455) | (16.578) |
| WEST-B Math | 279.548 | 274.924 | 280.000 | 280.233 | 281.775 | 282.064 | 280.093 | 274.978 | 280.261 | 280.838 | 282.431 | 282.357 |
| | (17.649) | (20.092) | (16.231) | (16.797) | (15.768) | (17.452) | (17.288) | (18.889) | (16.128) | (15.973) | (15.946) | (18.206) |
| Female | 0.764 | 0.745 | 0.732 | 0.785 | 0.792 | 0.779 | 0.768 | 0.762 | 0.721 | 0.791 | 0.773 | 0.797 |
| White | 0.785 | 0.756 | 0.778 | 0.796 | 0.785 | 0.817 | 0.779 | 0.759 | 0.755 | 0.789 | 0.791 | 0.804 |
| Asian | 0.045 | 0.028 | 0.056 | 0.040 | 0.058 | 0.049 | 0.050 | 0.033 | 0.063 | 0.042 | 0.060 | 0.054 |
| Black | 0.012 | 0.011 | 0.014 | 0.011 | 0.014 | 0.012 | 0.015 | 0.010 | 0.022 | 0.011 | 0.020 | 0.014 |
| Hispanic | 0.060 | 0.114 | 0.040 | 0.047 | 0.058 | 0.028 | 0.063 | 0.127 | 0.050 | 0.053 | 0.050 | 0.034 |
| Multi-race | 0.046 | 0.039 | 0.044 | 0.063 | 0.044 | 0.044 | 0.044 | 0.026 | 0.041 | 0.067 | 0.040 | 0.047 |
| Entering workforce | 0.602 | 0.508 | 0.599 | 0.615 | 0.674 | 0.646 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Reading VAM sample | 0.088 | 0.058 | 0.086 | 0.121 | 0.095 | 0.091 | 0.139 | 0.107 | 0.135 | 0.190 | 0.136 | 0.132 |
| Math VAM sample | 0.087 | 0.072 | 0.084 | 0.110 | 0.095 | 0.077 | 0.137 | 0.134 | 0.132 | 0.173 | 0.136 | 0.111 |
| Teacher candidates | 2376 | 569 | 501 | 447 | 432 | 427 | 1508 | 307 | 319 | 284 | 302 | 296 |

NOTE: We omit summary statistics for American Indian, Alaskan Native, and Other/Unspecified race candidates due to small cell sizes. 413 teacher candidates and 185
teachers are missing WEST-B scores, with the distribution of missing scores relatively uniform across quintiles. Standard deviations of continuous variables in parentheses.
Table 2. edTPA Performance by Teacher Candidate Race and Ethnicity

                                              Teacher candidate sample
                                              Overall    Asian      Black      White      Hispanic   Multi-race
Total score                                   46.339     47.602+    46.966     46.544     42.972***  46.582
                                              (6.960)    (6.379)    (5.698)    (6.901)    (7.747)    (6.454)
Planning subscore                             15.924     16.389+    16.034     15.966     14.913***  16.250
                                              (2.644)    (2.537)    (1.927)    (2.618)    (2.930)    (2.530)
Instruction subscore                          15.460     15.593     15.983     15.515     14.559***  15.445
                                              (2.416)    (2.173)    (2.230)    (2.427)    (2.448)    (2.305)
Assessment subscore                           14.922     15.583*    14.845     15.031     13.451***  14.868
                                              (2.948)    (2.666)    (2.435)    (2.910)    (3.452)    (2.698)
Overall % Passing WA score (35)               0.939      0.972+     1.000***   0.943      0.839**    0.964
Passing score (35): Pre-Consequential         0.876      0.812      1.000***   0.878      0.780      1.000***
Passing score (35): Post-Consequential        0.958      1.000***   1.000***   0.964      0.863**    0.956
Overall % Passing future score (40)           0.865      0.926+     0.897      0.874      0.706***   0.864
Passing score (40): Pre-Consequential         0.769      0.625      1.000***   0.792      0.634+     0.700
Passing score (40): Post-Consequential        0.895      0.978***   0.889      0.901      0.735***   0.900
Planning: Planning for subject specific       3.250      3.361      3.190      3.263      3.010***   3.341
  understanding                               (0.658)    (0.673)    (0.451)    (0.653)    (0.670)    (0.628)
Planning: Planning to support varied          3.199      3.278      3.328      3.201      3.052*     3.305
  learning needs                              (0.724)    (0.635)    (0.602)    (0.726)    (0.738)    (0.739)
Analyzing Teaching: Using knowledge of        3.215      3.306      3.172      3.217      3.094+     3.277
  students to inform teaching and learning    (0.691)    (0.733)    (0.602)    (0.685)    (0.717)    (0.676)
Academic Language: Identifying and            3.114      3.162      3.017      3.136      2.857***   3.077
  supporting language demands                 (0.651)    (0.641)    (0.491)    (0.642)    (0.740)    (0.618)
Planning: Planning assessments to             3.146      3.282*     3.328      3.150      2.899***   3.250
  monitor and support student learning        (0.711)    (0.639)    (0.631)    (0.709)    (0.817)    (0.652)
Instruction: Learning environment             3.251      3.231      3.414      3.258      3.157*     3.245
                                              (0.524)    (0.570)    (0.552)    (0.517)    (0.446)    (0.589)
Instruction: Engaging students in             3.104      3.139      3.224      3.123      2.916***   3.091
  learning                                    (0.618)    (0.579)    (0.560)    (0.620)    (0.602)    (0.606)
Instruction: Deepening student learning       3.060      3.065      3.086      3.068      2.874**    3.091
                                              (0.661)    (0.552)    (0.584)    (0.666)    (0.723)    (0.606)
Instruction: Subject-specific pedagogy:       3.076      3.167      3.138      3.084      2.892**    3.064
  Using representations                       (0.668)    (0.580)    (0.533)    (0.677)    (0.680)    (0.639)
Analyzing Teaching: Analyzing teaching        2.968      2.991      3.121      2.983      2.720***   2.955
  effectiveness                               (0.724)    (0.652)    (0.764)    (0.726)    (0.740)    (0.634)
Assessment: Analysis of student learning      3.148      3.370**    3.138      3.168      2.790***   3.123
                                              (0.761)    (0.718)    (0.625)    (0.745)    (0.901)    (0.668)
Assessment: Providing feedback to guide       3.165      3.236      3.155      3.192      2.860***   3.182
  learning                                    (0.782)    (0.715)    (0.780)    (0.784)    (0.829)    (0.735)
Assessment: Student use of feedback           2.711      2.801      2.707      2.738      2.472***   2.550**
                                              (0.760)    (0.752)    (0.575)    (0.760)    (0.843)    (0.717)
Academic Language: Analyzing student's        2.830      2.880      2.724      2.852      2.538***   2.909
  language use and subject-specific learning  (0.696)    (0.615)    (0.544)    (0.691)    (0.822)    (0.668)
Analyzing Teaching: Using assessment          3.068      3.296**    3.121      3.080      2.790***   3.105
  to inform instruction                       (0.785)    (0.680)    (0.715)    (0.774)    (0.871)    (0.824)
Observations                                  2376       108        29         1864       143        110

NOTE: We omit summary statistics for American Indian, Alaskan Native, and Other/Unspecified race candidates due
to small cell sizes. Significance stars are from a two-sample t-test with unequal variances between White teacher
candidates and the race indicated by column. +p<.10; *p<.05; **p<.01; ***p<.001. Passing rates for both the pre-
consequential period and for the future cut score assume no other behavioral changes are associated with the change of
cut score or stakes attached to the test.
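
The significance stars in Table 2 come from a two-sample t-test with unequal variances (Welch's test). As a rough illustration, the sketch below recomputes the White versus Hispanic comparison on the total score from the published summary statistics alone. It assumes that the group sizes in the Observations row apply to the total score (no missing scores), and it substitutes a standard normal approximation for the t distribution when computing the p-value. The function names are hypothetical; this is not the authors' code.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from group means, standard deviations, and sample sizes."""
    var_a = sd_a ** 2 / n_a  # squared standard error of mean A
    var_b = sd_b ** 2 / n_b  # squared standard error of mean B
    t_stat = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t_stat, df


def two_sided_p_normal(t_stat):
    """Two-sided p-value from the standard normal distribution.
    Assumption: used here only as a large-df stand-in for the t distribution."""
    return math.erfc(abs(t_stat) / math.sqrt(2))


# Total score, Table 2: White (mean 46.544, SD 6.901, n 1864)
# versus Hispanic (mean 42.972, SD 7.747, n 143).
t_stat, df = welch_t(46.544, 6.901, 1864, 42.972, 7.747, 143)
p_approx = two_sided_p_normal(t_stat)
print(f"t = {t_stat:.2f}, df = {df:.1f}, approximate p = {p_approx:.2e}")
```

Under these assumptions the approximate p-value falls below .001, which agrees with the three stars reported for the Hispanic total score.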
NOTE: All models control for teacher degree level and test type effects. Average marginal effects calculated from logit model in equation 1. Of the full sample
of 2,238 teachers, 2,238 teachers take the same tests with at least one other teacher. Similarly, 2,237 teachers were enrolled in TEPs with at least one other teacher.
+p<.10; *p<.05; **p<.01; ***p<.001
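
The employment results are reported as average marginal effects from a logit model. The sketch below shows, under stated assumptions, how such an effect is typically computed once coefficients are in hand: for a continuous regressor it averages the derivative of the predicted probability over the sample, and for a binary regressor (such as a pass indicator) it averages the change in predicted probability when the indicator is switched from 0 to 1. The coefficients and data rows are made up for illustration, and the function names are hypothetical; nothing here reproduces the paper's equation 1 or its estimates.

```python
import math


def logistic(z):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted_prob(beta, intercept, x):
    """P(y = 1 | x) under a logit model."""
    return logistic(intercept + sum(b * v for b, v in zip(beta, x)))


def average_marginal_effect(beta, intercept, rows, index):
    """AME of a continuous regressor: mean over rows of beta_k * p * (1 - p)."""
    total = 0.0
    for x in rows:
        p = predicted_prob(beta, intercept, x)
        total += beta[index] * p * (1.0 - p)
    return total / len(rows)


def average_discrete_effect(beta, intercept, rows, index):
    """AME of a binary regressor: mean of P(y=1 | x_k=1) - P(y=1 | x_k=0)."""
    total = 0.0
    for x in rows:
        x_on = list(x)
        x_off = list(x)
        x_on[index] = 1.0
        x_off[index] = 0.0
        total += predicted_prob(beta, intercept, x_on) - predicted_prob(
            beta, intercept, x_off
        )
    return total / len(rows)


# Hypothetical example: regressors are (standardized score, pass indicator).
beta = [0.3, 0.9]      # made-up coefficients
intercept = -0.2       # made-up intercept
rows = [[-1.0, 0.0], [0.0, 1.0], [0.5, 1.0], [1.2, 1.0]]
print(average_marginal_effect(beta, intercept, rows, 0))
print(average_discrete_effect(beta, intercept, rows, 1))
```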
Table 4. Value-Added Results in Reading (Stacked Model)

                                  edTPA as a screen                                           edTPA as a signal
Variables of interest             1         2         3         4         5         6         7         8         9         10        11        12
Passing in Washington             0.251**   0.191*    0.247***
                                  (0.073)   (0.080)   (0.065)
Future Washington passing score                                 0.203**   0.149**   0.169**
                                                                (0.058)   (0.054)   (0.058)
Total score                                                                                   0.022     0.003     0.006
                                                                                              (0.017)   (0.018)   (0.018)
Assessment factor                                                                                                           0.031+    0.017     0.050*
                                                                                                                            (0.017)   (0.016)   (0.020)
Planning factor                                                                                                             0.020     0.018     -0.007
                                                                                                                            (0.014)   (0.014)   (0.014)
Instruction factor                                                                                                          -0.025+   -0.028+   -0.031
                                                                                                                            (0.014)   (0.016)   (0.019)
TEP effects                                 X                             X                             X                             X
District effects                                      X                             X                             X                             X
Teachers                          210       210       210       210       210       210       210       210       210       210       210       210

NOTE: All models control for student prior performance (either both or just lagged or twice lagged score with a missing value dummy for the other) and 
demographics, classroom-level student demographics, teacher degree level, and grade and test type effects. Of the full sample of 210 teachers, 206 take the same 
tests with at least one other teacher. Similarly, 204 and 174 teachers were enrolled in TEPs and employed in districts with at least one other teacher. All standard 
errors are clustered at the teacher level. +p<.10; *p<.05; **p<.01; ***p<.001 
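
The note above says students enter the stacked model with both prior scores, or with only one of them plus a missing-value dummy for the other. A minimal sketch of that kind of control construction follows. The zero fill value, the exclusion of students with no prior score at all, and the function name are assumptions made for illustration; the source does not spell out these details.

```python
from typing import Optional, Tuple


def prior_score_controls(
    lagged: Optional[float], twice_lagged: Optional[float]
) -> Optional[Tuple[float, int, float, int]]:
    """Build (lag1, lag1_missing, lag2, lag2_missing) for one student.

    A missing score is replaced by 0.0 (assumed fill value) and flagged with
    a dummy equal to 1, so the regression can absorb the fill value.
    Students with neither prior score return None (assumed to be dropped).
    """
    if lagged is None and twice_lagged is None:
        return None
    lag1 = 0.0 if lagged is None else lagged
    lag2 = 0.0 if twice_lagged is None else twice_lagged
    return (lag1, int(lagged is None), lag2, int(twice_lagged is None))


# Hypothetical standardized prior scores for three students.
print(prior_score_controls(0.30, -0.10))  # both scores observed
print(prior_score_controls(0.30, None))   # twice-lagged score missing
print(prior_score_controls(None, None))   # no prior scores
```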
Table 5. Value-added Results in Math (Stacked Model)

                                  edTPA as a screen                                           edTPA as a signal
Variables of interest             1         2         3         4         5         6         7         8         9         10        11        12
Passing in Washington             0.038     0.061     0.061
                                  (0.071)   (0.068)   (0.058)
Future Washington passing score                                 0.052     0.085+    0.036
                                                                (0.045)   (0.043)   (0.037)
Total score                                                                                   0.029+    0.035*    0.015
                                                                                              (0.015)   (0.016)   (0.014)
Assessment factor                                                                                                           -0.004    0.003     0.016
                                                                                                                            (0.026)   (0.026)   (0.019)
Planning factor                                                                                                             0.060*    0.071+    0.002
                                                                                                                            (0.025)   (0.025)   (0.021)
Instruction factor                                                                                                          -0.027    -0.041+   -0.001
                                                                                                                            (0.021)   (0.021)   (0.022)
TEP effects                                 X                             X                             X                             X
District effects                                      X                             X                             X                             X
Teachers                          206       206       206       206       206       206       206       206       206       206       206       206


NOTE: All models control for student prior performance (either both or just lagged or twice lagged score with a missing value dummy for the other) and
demographics, classroom-level student demographics, teacher degree level, and grade and test type effects. Of the full sample of 206 teachers, 202 teachers
take the same tests with at least one other teacher. Similarly, 201 and 176 teachers were enrolled in TEPs and employed in districts with at least one other
teacher. All standard errors are clustered at the teacher level. +p<.10; *p<.05; **p<.01; ***p<.001
Table 6. Conditional Probabilities of Teacher Effectiveness Given edTPA Performance

                    Stacked Math Sample     Stacked Reading Sample
                    Fail        Pass        Fail        Pass
Bottom Quintile     0.190       0.202       0.462**     0.185
                    (0.088)     (0.029)     (0.110)     (0.028)
Top Quintile        0.143       0.202       0.077       0.205
                    (0.087)     (0.030)     (0.111)     (0.029)

NOTE: +p<.10; *p<.05; **p<.01; ***p<.001. Each cell gives the probability that a teacher with
the indicated performance on the edTPA falls into each quintile of the value-added distribution.
Standard errors in parentheses. The test of significance is against the null hypothesis that the
proportion is 0.2.
Figures

Figure 1. Distribution of edTPA scores for White and Hispanic Teacher Candidates

Figure 2. Relationship Between edTPA Scores and Probability of Public Teaching Employment

Figure 3. Relationship Between edTPA Scores and Reading Value Added

Figure 4. Relationship Between edTPA Scores and Mathematics Value Added
Appendix


Table A1. Models predicting public teaching employment after the edTPA became consequential

                                  edTPA as a screen                       edTPA as a signal
Variables of interest             1         2         3         4         5         6         7         8
Passing in Washington             0.248***  0.197**
                                  (0.059)   (0.058)
Future Washington passing score                       0.200***  0.163***
                                                      (0.038)   (0.037)
Total score                                                               0.076***  0.058***
                                                                          (0.012)   (0.013)
Assessment factor                                                                             0.038**   0.03[?]
                                                                                              (0.017)   (0.017)
Planning factor                                                                               0.004     0.009
                                                                                              (0.016)   (0.016)
Instruction factor                                                                            0.042**   0.027+
                                                                                              (0.016)   (0.015)
TEP effects                                 X                   X                   X                   X
Teachers                          1,694     1,694     1,694     1,694     1,694     1,694     1,694     1,694

NOTE: All models control for teacher degree level and test type effects. Average marginal effects calculated from logit model in equation 1.
Of the full sample of 1,694 teachers, 1,694 teachers take the same tests with at least one other teacher. Similarly, 1,693 teachers were enrolled
in TEPs with at least one other teacher. +p<.10; *p<.05; **p<.01; ***p<.001
Table A2. Value-Added results in math (prior year test scores only)

                                  edTPA as a screen                                           edTPA as a signal
Variables of interest             1         2         3         4         5         6         7         8         9         10        11        12
Passing in Washington             0.061     0.056     0.081
                                  (0.085)   (0.077)   (0.076)
Future Washington Pass Score                                    0.119*    0.109     0.160***
                                                                (0.053)   (0.055)   (0.046)
Total Score                                                                                   0.018     0.029     0.009
                                                                                              (0.019)   (0.018)   (0.015)
Assessment factor                                                                                                           -0.015    -0.009    0.035
                                                                                                                            (0.033)   (0.031)   (0.023)
Planning factor                                                                                                             0.062*    0.068*    -0.031
                                                                                                                            (0.031)   (0.031)   (0.030)
Instruction factor                                                                                                          -0.028    -0.030    0.003
                                                                                                                            (0.027)   (0.026)   (0.025)
TEP effects                                 X                             X                             X                             X
District Effects                                      X                             X                             X                             X
Teachers                          145       145       145       145       145       145       145       145       145       145       145       145

NOTE: All models control for student prior performance and demographics, classroom-level student demographics, teacher degree level, and grade and test type
effects. Of the full sample of 145 teachers, 140 teachers take the same tests with at least one other teacher. Similarly, 137 and 125 teachers were enrolled in TEPs and
employed in districts with at least one other teacher. All standard errors are clustered at the teacher level. +p<.10; *p<.05; **p<.01; ***p<.001
Table A3. Value-Added results in math (multiple imputations model)

                                  edTPA as a screen                                           edTPA as a signal
Variables of interest             1         2         3         4         5         6         7         8         9         10        11        12
Passing in Washington             0.086     0.094     0.133*
                                  (0.073)   (0.072)   (0.064)
Future Washington Pass Score                                    0.056     0.104*    0.043
                                                                (0.045)   (0.043)   (0.039)
Total Score                                                                                   0.020     0.033     -0.000
                                                                                              (0.017)   (0.018)   (0.015)
Assessment factor                                                                                                           -0.029    -0.024    -0.016
                                                                                                                            (0.026)   (0.025)   (0.017)
Planning factor                                                                                                             0.074**   0.082**   0.019
                                                                                                                            (0.028)   (0.027)   (0.024)
Instruction factor                                                                                                          -0.022    -0.028    0.000
                                                                                                                            (0.023)   (0.023)   (0.021)
TEP effects                                 X                             X                             X                             X
District Effects                                      X                             X                             X                             X
Teachers                          157       157       157       157       157       157       157       157       157       157       157       157

NOTE: All models control for student prior performance (either both or just lagged or twice lagged score with an imputed prior score) and demographics, classroom-level
student demographics, teacher degree level, and grade and test type effects. Of the full sample of 157 teachers, 153 teachers take the same tests with at least one
other teacher. Similarly, 151 and 124 teachers were enrolled in TEPs and employed in districts with at least one other teacher. All standard errors are clustered at the
teacher level. +p<.10; *p<.05; **p<.01; ***p<.001
Table A4. Value-Added results in reading (prior year test scores only)

                                  edTPA as a screen                                           edTPA as a signal
Variables of interest             1         2         3         4         5         6         7         8         9         10        11        12
Passing in Washington             0.250**   0.129     0.199*
                                  (0.086)   (0.089)   (0.086)
Future Washington Pass Score                                    0.171*    0.092     0.074
                                                                (0.070)   (0.066)   (0.081)
Total Score                                                                                   0.013     0.005     -0.007
                                                                                              (0.020)   (0.020)   (0.021)
Assessment factor                                                                                                           0.026     0.010     0.024
                                                                                                                            (0.022)   (0.020)   (0.026)
Planning factor                                                                                                             0.026     0.024     0.000
                                                                                                                            (0.018)   (0.017)   (0.019)
Instruction factor                                                                                                          -0.033*   -0.028    -0.026
                                                                                                                            (0.016)   (0.017)   (0.020)
TEP effects                                 X                             X                             X                             X
District Effects                                      X                             X                             X                             X
Teachers                          148       148       148       148       148       148       148       148       148       148       148       148

NOTE: All models control for student prior performance and demographics, classroom-level student demographics, teacher degree level, and grade and test type
effects. Of the full sample of 148 teachers, 146 teachers take the same tests with at least one other teacher. Similarly, 145 and 125 teachers were enrolled in TEPs and
employed in districts with at least one other teacher. All standard errors are clustered at the teacher level. +p<.10; *p<.05; **p<.01; ***p<.001
Table A5. Value-Added results in reading (multiple imputations model)

                                  edTPA as a screen                                           edTPA as a signal
Variables of interest             1         2         3         4         5         6         7         8         9         10        11        12
Passing in Washington             0.243**   0.152     0.209***
                                  (0.077)   (0.079)   (0.061)
Future Washington Pass Score                                    0.220***  0.147*    0.166*
                                                                (0.064)   (0.060)   (0.073)
Total Score                                                                                   0.009     -0.011    -0.018
                                                                                              (0.017)   (0.017)   (0.017)
Assessment factor                                                                                                           0.014     -0.007    0.039
                                                                                                                            (0.020)   (0.018)   (0.023)
Planning factor                                                                                                             0.017     0.013     -0.021
                                                                                                                            (0.015)   (0.015)   (0.013)
Instruction factor                                                                                                          -0.019    -0.017    -0.032
                                                                                                                            (0.014)   (0.015)   (0.016)
TEP effects                                 X                             X                             X                             X
District Effects                                      X                             X                             X                             X
Teachers                          163       163       163       163       163       163       163       163       163       163       163       163

NOTE: All models control for student prior performance (either both or just lagged or twice lagged score with an imputed prior score) and demographics, classroom-level
student demographics, teacher degree level, and grade and test type effects. Of the full sample of 163 teachers, 159 teachers take the same tests with at least one other
teacher. Similarly, 156 and 125 teachers were enrolled in TEPs and employed in districts with at least one other teacher. All standard errors are clustered at the teacher
level. +p<.10; *p<.05; **p<.01; ***p<.001
Table A6. Value-Added results in math (stacked model) after the edTPA became consequential

                                  edTPA as a screen                                           edTPA as a signal
Variables of interest             1         2         3         4         5         6         7         8         9         10        11        12
Passing in Washington             0.003     0.003     0.061
                                  (0.130)   (0.120)   (0.088)
Future Washington Pass Score                                    0.018     0.071     0.012
                                                                (0.064)   (0.070)   (0.045)
Total Score                                                                                   0.034+    0.044+    0.025
                                                                                              (0.019)   (0.023)   (0.016)
Assessment factor                                                                                                           -0.009    0.003     -0.004
                                                                                                                            (0.031)   (0.031)   (0.025)
Planning factor                                                                                                             0.069*    0.084**   0.030
                                                                                                                            (0.029)   (0.029)   (0.023)
Instruction factor                                                                                                          -0.023    -0.036    0.000
                                                                                                                            (0.027)   (0.026)   (0.026)
TEP effects                                 X                             X                             X                             X
District Effects                                      X                             X                             X                             X
Teachers                          156       156       156       156       156       156       156       156       156       156       156       156

NOTE: All models control for student prior performance (either both or just lagged or twice lagged score with a missing value dummy for the other) and
demographics, classroom-level student demographics, teacher degree level, and grade and test type effects. Of the full sample of 156 teachers, 151 teachers take the
same tests with at least one other teacher. Similarly, 149 and 130 teachers were enrolled in TEPs and employed in districts with at least one other teacher. All standard
errors are clustered at the teacher level. +p<.10; *p<.05; **p<.01; ***p<.001
Table A7. Value-Added results in reading (stacked model) after the edTPA became consequential

                                  edTPA as a screen                                           edTPA as a signal
Variables of interest             1         2         3         4         5         6         7         8         9         10        11        12
Passing in Washington             0.262**   0.286**   0.349***
                                  (0.091)   (0.101)   (0.101)
Future Washington Pass Score                                    0.199**   0.189**   0.225***
                                                                (0.067)   (0.059)   (0.064)
Total Score                                                                                   0.035*    0.032+    0.045**
                                                                                              (0.016)   (0.019)   (0.016)
Assessment factor                                                                                                           0.045*    0.036*    0.075***
                                                                                                                            (0.018)   (0.018)   (0.018)
Planning factor                                                                                                             -0.002    -0.002    -0.025
                                                                                                                            (0.017)   (0.018)   (0.016)
Instruction factor                                                                                                          -0.003    0.004     0.007
                                                                                                                            (0.016)   (0.018)   (0.018)
TEP effects                                 X                             X                             X                             X
District Effects                                      X                             X                             X                             X
Teachers                          160       160       160       160       160       160       160       160       160       160       160       160

NOTE: All models control for student prior performance (either both or just lagged or twice lagged score with a missing value dummy for the other) and demographics,
classroom-level student demographics, teacher degree level, and grade and test type effects. Of the full sample of 160 teachers, 158 teachers take the same tests with at
least one other teacher. Similarly, 157 and 127 teachers were enrolled in TEPs and employed in districts with at least one other teacher. All standard errors are clustered at
the teacher level. +p<.10; *p<.05; **p<.01; ***p<.001